Data insights platform for a security and compliance environment

ABSTRACT

A multi-purpose platform may collect different types of signals such as metadata, documents, activities, etc. and correlate them in a multi-stage evaluation framework in order to allow simple queries from components and clients of a compliance and security environment to be converted into rich analyses on available data. Various signals may be collected from a tenant environment and correlated at multiple levels based on their content and context. Queries from components such as a threat intelligence manager, a data explorer module, or even clients of the system may be executed on the correlated data by focusing and/or filtering the queries based on the context, effectively converting a simple query to a comprehensive analysis. The platform may have intelligence to decide which type of data to run a query on based on the request and allow data investigations in the form of a chain-linked investigation that can go multiple levels deep.

CROSS-REFERENCE TO RELATED APPLICATION

This application claims the benefit under 35 U.S.C. § 119(e) of U.S. Patent Application No. 62/440,934 filed on Dec. 30, 2016. The U.S. patent application is herein incorporated by reference in its entirety.

BACKGROUND

Hosted services provided by tenants of service providers to their users, such as companies to their employees or organizations to their members, are an increasingly common software usage model. Hosted services cover a wide range of software applications and systems from cloud storage to productivity, and collaboration to communication. Thus, any number of users may utilize applications provided under a hosted service umbrella in generating, processing, storing, and collaborating on documents and other data.

Accuracy, efficiency, and effectiveness of security and compliance services that analyze, protect, and support a variety of hosted services can increase in proportion to the range and type of underlying data and analysis capabilities on such data. For example, checking only incoming emails or attachments for malicious threats can be very limiting and may fail to catch actions of users or malware that have slipped through the defenses. Conventional services directed to security or compliance are typically single-dimensional and suffer from the resulting limitations.

SUMMARY

This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This summary is not intended to exclusively identify key features or essential features of the claimed subject matter, nor is it intended as an aid in determining the scope of the claimed subject matter.

Embodiments are directed to a data insights platform for a security and compliance environment. A data insights platform associated with a hosted service may collect a plurality of signals from a plurality of resources within a tenant's hosted environment, where the collected plurality of signals are correlated at one or more levels based on their content and context. The data insights platform may receive a query associated with the collected plurality of signals and focus/filter the query on a portion of the collected and correlated signals based on a context of the query in relation to the collected and correlated signals. The data insights platform may then reply to the query with a comprehensive analysis report.

These and other features and advantages will be apparent from a reading of the following detailed description and a review of the associated drawings. It is to be understood that both the foregoing general description and the following detailed description are explanatory and do not restrict aspects as claimed.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1A through 1C include display diagrams illustrating an example network environment where a system to provide a data insights platform for a security and compliance environment may be implemented;

FIGS. 2A and 2B include display diagrams illustrating components and interactions of a security and compliance service providing a data insights platform for a security and compliance environment;

FIG. 3 includes a display diagram illustrating conceptually inputs and outputs of a data insights platform in a security and compliance environment;

FIG. 4 includes a display diagram illustrating a data explorer dashboard in conjunction with a data insights platform for a security and compliance environment;

FIG. 5 includes a display diagram illustrating a threat intelligence dashboard in conjunction with a data insights platform for a security and compliance environment;

FIG. 6 is a networked environment, where a system according to embodiments may be implemented;

FIG. 7 is a block diagram of an example computing device, which may be used to provide a data insights platform for a security and compliance environment; and

FIG. 8 illustrates a logic flow diagram of a method to provide a data insights platform for a security and compliance environment, arranged in accordance with at least some embodiments described herein.

DETAILED DESCRIPTION

As briefly described above, embodiments are directed to a data insights platform for a security and compliance environment. In some examples, a multi-purpose platform may collect different types of signals such as metadata, documents, activities, etc. and correlate them in a multi-stage evaluation framework in order to allow simple queries from components and clients of a compliance and security environment to be converted into rich analyses on available data. Various signals may be collected from a tenant environment and correlated at multiple levels based on their content and context. Queries from components such as a threat intelligence manager, a data explorer module, or even clients of the system (tenant administrator, other hosted services) may be executed on the correlated data by focusing and/or filtering the queries based on the context, effectively converting a simple query to a comprehensive analysis. The platform may have intelligence to decide which type of data to run a query on based on the request and allow data investigations in the form of a chain-linked investigation that can go multiple levels deep.

In the following detailed description, references are made to the accompanying drawings that form a part hereof, and in which are shown by way of illustrations, specific embodiments, or examples. These aspects may be combined, other aspects may be utilized, and structural changes may be made without departing from the spirit or scope of the present disclosure. The following detailed description is therefore not to be taken in a limiting sense, and the scope of the present invention is defined by the appended claims and their equivalents.

While some embodiments will be described in the general context of program modules that execute in conjunction with an application program that runs on an operating system on a personal computer, those skilled in the art will recognize that aspects may also be implemented in combination with other program modules.

Generally, program modules include routines, programs, components, data structures, and other types of structures that perform particular tasks or implement particular abstract data types. Moreover, those skilled in the art will appreciate that embodiments may be practiced with other computer system configurations, including hand-held devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, mainframe computers, and comparable computing devices. Embodiments may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.

Some embodiments may be implemented as a computer-implemented process (method), a computing system, or as an article of manufacture, such as a computer program product or computer readable media. The computer program product may be a computer storage medium readable by a computer system and encoding a computer program that comprises instructions for causing a computer or computing system to perform example process(es). The computer-readable storage medium is a computer-readable memory device. The computer-readable storage medium can for example be implemented via one or more of a volatile computer memory, a non-volatile memory, a hard drive, a flash drive, a floppy disk, or a compact disk, and comparable hardware media.

Throughout this specification, the term “platform” may be a combination of software and hardware components for providing a data insights platform for a security and compliance environment. Examples of platforms include, but are not limited to, a hosted service executed over a plurality of servers, an application executed on a single computing device, and comparable systems. The term “server” generally refers to a computing device executing one or more software programs typically in a networked environment. However, a server may also be implemented as a virtual server (software programs) executed on one or more computing devices viewed as a server on the network. More detail on these technologies and example operations is provided below.

FIGS. 1A through 1C include display diagrams illustrating an example network environment where a system to provide a data insights platform for a security and compliance environment may be implemented.

As illustrated in diagrams 100A-100C, an example system may include a datacenter 112 executing a hosted service 114 on at least one processing server 116, which may provide productivity, communication, cloud storage, collaboration, and comparable services to users in conjunction with other servers 120, for example. The hosted service 114 may further include scheduling services, online conferencing services, and comparable ones. The hosted service 114 may be configured to interoperate with a client application 106 through one or more client devices 102 over one or more networks, such as network 110. The client devices 102 may include a desktop computer, a laptop computer, a tablet computer, a vehicle-mount computer, a smart phone, or a wearable computing device, among other similar devices. In some examples, the hosted service 114 may allow users to access its services through the client application 106 executed on the client devices 102. In other examples, the hosted service 114 may be provided to a tenant (e.g., a business, an organization, or similar entities), which may configure and manage the services for their users.

In one embodiment, as illustrated in diagram 100A, the processing server 116 may be operable to execute a security and compliance application 118 of the hosted service 114, where the security and compliance application 118 may be integrated with the hosted service 114 to provide data management, security, threat management, data storage and processing compliance, and similar services. The security and compliance application 118 may include a data insights platform 122 configured to collect different types of signals from the hosted service 114 environment such as metadata, documents, activities, etc. and correlate them in a multi-stage evaluation framework in order to allow simple queries from components and clients of the security and compliance application 118 to be converted into rich analyses on available data.

In another embodiment, as illustrated in diagram 100B, the security and compliance application 118 may be executed at a client device 102 in conjunction with the client application 106. The data insights platform 122 may still be within the hosted service 114 receiving and aggregating data and activities throughout the hosted service 114 and providing the above-mentioned services. In a further embodiment, as illustrated in diagram 100C, a separate protection service 126 may be executed by one or more processing servers 124 and include components like a data explorer or threat intelligence module 128 to deal with various aspects of security and data compliance services. The protection service 126 may be configured to serve the hosted service 114 and/or multiple applications associated with the hosted service 114, such as the client application 106. Furthermore, the protection service 126 may provide its services to multiple hosted services. Thus, if a tenant subscribes to multiple hosted services, common information (e.g., analysis results, user profiles, data and metadata) may be used to coordinate security operations, threat management, policy implementations, and other data management aspects. A data insights platform 122 may be executed by separate servers 120 and work in conjunction with both the protection service 126 and the hosted service 114. The data insights platform 122 may be a multi-purpose platform providing its data aggregation services with correlation and multi-stage evaluation to multiple hosted services/protection services. As described herein, the hosted service 114, the security and compliance application 118, the data insights platform 122, and the protection service 126 may be implemented as software, hardware, or combinations thereof.

As previously discussed, hosted services provided by tenants of service providers to their users are an increasingly common software usage model because it allows any number of users to utilize applications provided under the hosted service umbrella in generating, processing, storing, and collaborating on documents and other data. The usage of hosted services may include processing and storage of large amounts of data, which may be subject to regulatory, legal, industry, and other rules, internal and external threats, etc. Thus, it is a challenging endeavor for system administrators to determine different categories of data and the applicable policies and rules for the categories, configure systems, implement the applicable policies, and take remediation actions. Implementation of a data insights platform for a security and compliance environment as described herein may allow tenants of a hosted service to understand their data, determine their security and compliance needs, configure their systems, implement new policies, and customize user interfaces in an efficient manner. Through these technical advantages, processing and network capacity may be preserved, data security may be enhanced, usability may be improved, and user interactivity may be increased.

Embodiments, as described herein, address a need that arises from a very large scale of operations created by software-based services that cannot be managed by humans. The actions/operations described herein are not a mere use of a computer, but address results of a system that is a direct consequence of software used as a service offered in conjunction with a large number of devices and users using hosted services.

FIGS. 2A and 2B include display diagrams illustrating components and interactions of a security and compliance service providing a data insights platform for a security and compliance environment.

Diagrams 200A and 200B show an example infrastructure for a comprehensive security and compliance service that may include among its components a data insights platform for aggregating data in a correlated and multi-stage evaluated manner. In some examples, data to be analyzed, categorized, protected, and handled according to policies may come from a variety of sources such as a communications data store 202, a collaboration data store 204, and cloud storage 206. On-premise data sources 208 may also contribute to the data to be processed. The data insights platform (correlated, multi-stage data storage service) 210 may receive stored data, activities associated with the data, and metadata, and correlate the data at multiple levels based on the activities and metadata. For example, a policy defining sharing or retention schedules for all word processing documents or all marketing documents may be overkill and consume unnecessary resources, result in false positives, etc. In a system according to embodiments, the broader data types may be categorized based on specific aspects such as who is accessing the data, where the data is being accessed from, whether the document includes sensitive information, etc. Policies and remediation actions may be determined according to these more granular categories, allowing a more accurate and efficient handling and protection of the data.
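By way of a non-limiting illustration, the granular categorization described above may be sketched as follows. The sketch is a minimal example under stated assumptions and is not a definitive implementation: the record fields, the category names, the policy names, and the set of trusted locations are all hypothetical and do not appear in the embodiments.

```python
from dataclasses import dataclass

# Hypothetical set of access locations that the tenant treats as trusted.
TRUSTED_LOCATIONS = {"US", "CA"}

# Hypothetical mapping from a granular category to a policy name.
POLICY_BY_CATEGORY = {
    "sensitive-high-risk": "block-external-sharing",
    "sensitive": "retain-and-encrypt",
    "general": "default-retention",
}


@dataclass(frozen=True)
class DocumentSignal:
    """Illustrative record combining content and context for one document."""
    doc_id: str
    doc_type: str             # broad data type, e.g. "word_processing"
    accessor: str             # who is accessing: "internal" or "external"
    access_location: str      # where the data is being accessed from
    has_sensitive_info: bool  # whether the document includes sensitive data


def categorize(signal: DocumentSignal) -> str:
    """Refine a broad data type into a more granular category."""
    if not signal.has_sensitive_info:
        return "general"
    risky = (
        signal.accessor == "external"
        or signal.access_location not in TRUSTED_LOCATIONS
    )
    return "sensitive-high-risk" if risky else "sensitive"


def policy_for(signal: DocumentSignal) -> str:
    """Select a policy from the granular category, not the broad type."""
    return POLICY_BY_CATEGORY[categorize(signal)]
```

In this sketch, two word processing documents may receive different policies because the policy is selected from the granular category rather than from the broad data type.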

The larger infrastructure may also include an alerts engine 212 to determine and issue alerts based on threats or unacceptable data usage, and a policy engine 214 to determine and implement retention, handling, and protection policies. As shown in diagram 200B, the correlated, multi-stage data storage may be utilized by a multitude of modules such as a threat intelligence module 230 to manage internal and external threats and a data explorer module 226 to identify categories of data and determine applicable policies and remediation actions for the identified data. In some embodiments, the data explorer module 226 may be configured to receive attribute information such as a label, a sensitive data type, a data type, an age, a storage location of the data, a location of a user accessing the data, an identity of a user or an entity accessing the data, and whether the data is shared internally or externally for data stored in a correlated and multi-stage evaluated storage structure for the hosted service. The attribute information may be generalized as classification, property, applied policy, and access. The data explorer module 226 may present a dashboard with one or more actionable visualizations representing distinct attributes of the data and upon receiving selections of attribute filters for the data through the dashboard, analyze the filtered data based on the received attribute information. The module may also determine a label, an applicable policy, and/or a remediation action for the data based on the analysis results. The determined label, applicable policy, and/or remediation action may be presented through the dashboard.

Based on the analysis, the data explorer module 226 may suggest a policy or remediation action to be implemented in some examples. The suggestion may be to customize or update a currently implemented policy or configuration. The suggestion may encompass regulatory, legal, industrial, internal compliance, external compliance, and other security and compliance rules or standards employed to protect the tenant, for example. User experiences such as threat intelligence user interface 232, alerts user interface 224, and policy user interface 222 may be provided as part of a security and compliance center 220 to present actionable visualizations associated with various aspects of the service and receive user/administrator input to be provided to the various modules. Various application programming interfaces (APIs) such as REST API may be used for exchange of communications among the different components of the system.

FIG. 3 includes a display diagram illustrating conceptually inputs and outputs of a data insights platform for a security and compliance environment.

In the example configuration of diagram 300, a data insights platform 310 may include a reporting framework 312, an aggregation store 314, and a data insights API 318, where contextual searches 316 may be performed on the aggregated data (correlated and multi-stage evaluated) through the data insights API 318. The reporting framework 312 may define and manage replies to queries. A background job framework 320 may perform tasks associated with alert, policy, and threat intelligence aggregation 322. Other aggregation tasks may include reporting aggregation 326 and user experience data aggregation 328, which may manage customization insights 332 and tenant usage insights. The background job framework 320 may also perform default and system policy tasks 324.

Different types of signals such as metadata, documents, communications, activities, etc. may be collected from a tenant environment and correlated in a multi-stage evaluation framework based on their content and context. The correlation and aggregation may be performed according to a component of the security and compliance service, such as alert, policy, or threat management aggregation. The system may allow simple queries from components and clients of the security and compliance service to be converted into rich analyses on available tenant data. The aggregated reports may be provided to the system components for alert notifications, threat management policies, data explorer classifications, etc. (302). The background job framework 320 may work with a security and compliance data store 308, which may also be used by policy and threat intelligence management services 306. The services may create workloads 304 for the system.
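As a non-limiting sketch, correlating collected signals at more than one level may be illustrated as follows. The example assumes, purely for illustration, that each signal is a dictionary with hypothetical keys "kind", "doc_id", "sensitive", and "external"; none of these names come from the embodiments. A first level groups raw signals by the document they refer to (content), and a second level combines metadata and activity signals within each group (context) to derive a new signal.

```python
from collections import defaultdict
from typing import Dict, List


def correlate(signals: List[dict]) -> List[dict]:
    """Two-level correlation over hypothetical signal dictionaries."""
    # Level 1: group raw signals by the document they refer to.
    by_doc: Dict[str, List[dict]] = defaultdict(list)
    for signal in signals:
        by_doc[signal["doc_id"]].append(signal)

    # Level 2: within each group, combine content (a sensitive-data flag
    # from metadata) with context (an external access activity) and emit
    # a new, derived signal when both are present.
    derived = []
    for doc_id, group in by_doc.items():
        sensitive = any(
            s["kind"] == "metadata" and s.get("sensitive") for s in group
        )
        external = any(
            s["kind"] == "activity" and s.get("external") for s in group
        )
        if sensitive and external:
            derived.append({
                "kind": "derived",
                "doc_id": doc_id,
                "finding": "sensitive-data-accessed-externally",
            })
    return derived
```

A later query may then run against the derived signals rather than against the raw metadata and activity signals.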

The collected signals may also include relationships (organizational), configurations (data, system, permissions, etc.), and comparable ones. In some cases, pre-correlated signals such as those from a graph-based data correlation system may also be received and used. New signals may be generated as signals are correlated at different levels. The data insights platform may have the intelligence to decide which type of data to run a query on based on the request and how to augment a query based on context.
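One simple way to picture the selection of a data type for a query and the augmentation of the query based on context is sketched below. This is an assumption-laden illustration only: the keyword routing table, the in-memory signal store, and the field names are hypothetical, and an actual platform may use any suitable selection logic.

```python
from typing import Dict, List

# Hypothetical in-memory store, partitioned by signal type.
SIGNALS: Dict[str, List[dict]] = {
    "activity": [
        {"tenant": "contoso", "user": "alice", "action": "download"},
        {"tenant": "fabrikam", "user": "bob", "action": "download"},
    ],
    "document": [
        {"tenant": "contoso", "doc_id": "d1", "label": "confidential"},
    ],
}

# Hypothetical routing table: query keyword -> signal type to search.
ROUTES = {
    "login": "activity",
    "download": "activity",
    "label": "document",
    "sensitive": "document",
}


def select_signal_type(query: str, default: str = "activity") -> str:
    """Decide which type of data to run the query on, based on the request."""
    for word in query.lower().split():
        if word in ROUTES:
            return ROUTES[word]
    return default


def run_query(query: str, context: Dict[str, str]) -> List[dict]:
    """Focus the query on signals that match every item of the context."""
    rows = SIGNALS[select_signal_type(query)]
    return [
        row for row in rows
        if all(row.get(key) == value for key, value in context.items())
    ]
```

For instance, a query mentioning "download" issued in the context of a single tenant would be run on activity signals and narrowed to that tenant.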

The data insights platform may allow data investigations in some examples. A chain-linked investigation that can go multiple levels deep may be performed on various types of data. For example: a malware threat may be detected as an attachment to emails. Some emails may have been delivered to recipients prior to detection. Multi-stage investigation may determine who were the recipients; among the recipients, who opened their emails; among those who opened their emails, who opened the attachment, and correlate those levels with corresponding remediation actions. The investigation may be made even more comprehensive by adding context of which recipients are considered higher risk for the organization, which documents/content may be affected, etc.
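The multi-stage email investigation described above may be sketched, in a non-limiting manner, as successive set intersections, one per level of the chain. The function name, the shape of the input mappings, and the remediation action strings are hypothetical and are chosen only to make the example self-contained.

```python
from typing import Dict, Set


def investigate(
    message_id: str,
    delivered: Dict[str, Set[str]],
    opened_email: Dict[str, Set[str]],
    opened_attachment: Dict[str, Set[str]],
    high_risk: Set[str],
) -> Dict[str, str]:
    """Chain-linked investigation; each level narrows the previous one."""
    # Level 1: who received the message before detection?
    recipients = delivered.get(message_id, set())
    # Level 2: among the recipients, who opened the email?
    readers = recipients & opened_email.get(message_id, set())
    # Level 3: among those readers, who opened the attachment?
    exposed = readers & opened_attachment.get(message_id, set())

    # Correlate each level with a (hypothetical) remediation action.
    actions = {}
    for user in sorted(recipients):
        if user in exposed:
            action = "isolate device and scan"
        elif user in readers:
            action = "purge message and warn user"
        else:
            action = "purge message"
        # Added context: prioritize higher-risk users who read the email.
        if user in high_risk and user in readers:
            action += " (priority)"
        actions[user] = action
    return actions
```

Because each level is computed from the result of the previous level, further levels (for example, documents touched after the attachment was opened) may be appended in the same manner.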

The data insights platform may be used for pattern detection through the multi-stage evaluation, as well as anomaly detection (e.g., a query for alerts may be set as “Tell me which activities are abnormal”). Upon detecting a pattern associated with the data and a usage of the data based on the analysis, an insight may be derived, for example, for an applicable policy based on the pattern, and the applicable policy may be presented as a suggested policy for the data based on the derived insight. The platform may also provide, in addition to the query results, raw or filtered signals from among the collected signals and/or any signals generated by the platform during routine correlation.
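The embodiments do not prescribe a particular anomaly detection technique. Purely as an illustrative assumption, a query such as “Tell me which activities are abnormal” could be answered with a simple standard-score test against each user's own activity history, as sketched below. The function name, the input shapes, and the threshold of 3.0 are hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List


def abnormal_activities(
    history: Dict[str, List[int]],
    today: Dict[str, int],
    threshold: float = 3.0,
) -> List[str]:
    """Flag users whose activity count today deviates from their history.

    history maps a user to past daily activity counts; today maps a user
    to the current count. A user is flagged when the current count lies
    more than `threshold` population standard deviations from the mean.
    """
    flagged = []
    for user, counts in history.items():
        if not counts:
            continue  # no baseline to compare against
        mu = mean(counts)
        sigma = pstdev(counts)
        value = today.get(user, 0)
        if sigma == 0:
            # Perfectly constant history: any change counts as abnormal.
            if value != mu:
                flagged.append(user)
        elif abs(value - mu) / sigma > threshold:
            flagged.append(user)
    return sorted(flagged)
```

Other techniques may equally be used; the sketch only shows how a simple natural-language style query can map onto an analysis over collected activity signals.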

FIG. 4 includes a display diagram illustrating a data explorer dashboard in conjunction with a data insights platform for a security and compliance environment.

Diagram 400 illustrates an example dashboard through which actionable visualizations may be presented and actions/policies may be implemented and monitored. As shown in the diagram, a client application may provide an administrator, for example, access to a user interface, such as a dashboard 402, associated with a data explorer module of a hosted service or a separate protection service. The dashboard 402 may present summary and/or detailed information associated with data categories, threats, security and compliance configurations, analysis results, and configuration controls, for example. Among other things, the dashboard 402 may comprise a plurality of tabs 404 that each offer one or more security and compliance-based features that may be managed by the tenant, administrators, and/or users through the dashboard 402. Example tabs 404 may include a home dashboard view, and additional views associated with threat analysis, alerts, security policies, data management, investigation, reports, global trends, and local trends.

In the data explorer view, users may be enabled to search data under various labels through a search box 406, and view/select actions 408, filters 410, etc. Various visualizations may include data by policy 412, data by label 414, sensitive data by type 416, access by location 418, data by age 420, and data sharing 422, for example. The visualizations may include graphic representations such as bar charts, pie charts, maps, and other representations employing a variety of highlighting, color, textual, graphic, and shading schemes. Some or all of the visualizations may be actionable, that is, a user may drill down on data by clicking on elements of the visualization, see details, change filtering parameters, change visualization parameters, etc. For example, a default data by label visualization may display a top 5 or 10 labels. Users may reduce or increase the number, change the graphic representation, etc. In some embodiments, users may be enabled to combine visualizations. For example, access by location visualization may be combined with sensitive data type or policy visualization such that a new visualization providing an intersection of the selected attributes may be presented.
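The data behind two of the behaviors described above, a default "top N labels" view and a combined visualization that intersects two selected attributes, may be sketched as simple counting operations. The record fields and function names below are hypothetical and serve only as an illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple


def top_labels(records: List[dict], n: int = 5) -> List[Tuple[str, int]]:
    """Data for a 'data by label' visualization showing the top n labels."""
    return Counter(r["label"] for r in records).most_common(n)


def combine(
    records: List[dict], attr_a: str, attr_b: str
) -> Dict[Tuple[str, str], int]:
    """Data for a combined visualization: counts per pair of attribute
    values, i.e. the intersection of the two selected attributes."""
    return dict(Counter((r[attr_a], r[attr_b]) for r in records))
```

For example, calling combine(records, "location", "sensitive_type") would yield, for each location, how many accessed items contain each sensitive data type, which a dashboard could render as a stacked chart or a map overlay.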

The underlying data for the visualizations and other information displayed on the dashboard may be received from a data insights platform through a series of queries to the platform, which may collect signals from a number of resources within the tenant's hosted environment. The collected signals may be correlated at one or more levels based on their content and context. The data insights platform may focus and/or filter the queries on a portion of the collected and correlated signals based on a context of the query in relation to the collected and correlated signals. The data insights platform may then reply to the query with a comprehensive analysis report.

FIG. 5 includes a display diagram illustrating a threat intelligence dashboard in conjunction with a data insights platform for a security and compliance environment.

As shown in diagram 500, a threat intelligence dashboard may provide visual information associated with current threats, protection status, and investigations with actionable items allowing selection of more detailed views, drill-down operations, and remediation actions. For example, dashboard 502 may present a user experience with visual and actionable information on potential threats and detected threats (global, industry level, regional, type, and other categories) using charts, lists, and/or maps (map of origination, affected areas, etc.). Through various schemes (color, shading, graphic, textual, etc.), correlation between internal and external threats may be displayed along with detailed information available through drill-down (i.e., a user can click on any displayed data point and be provided individual data). The user experience may also indicate whether threats are directed to the organization (or a particular group/people within the organization) or are general. Automatic remediation actions and results may be displayed along with suggested actions.

Among other things, the dashboard 502 may comprise a plurality of tabs 504 that each offer one or more security and compliance-based features that may be managed by the tenant, administrators, and/or users through the dashboard 502. Example tabs 504 may include a home dashboard view, and additional views associated with threat analysis, alerts, security policies, data management, data discovery, investigation, reports, global trends, and local trends.

In the example dashboard 502, threat detections 505 may present various categories of threat detections graphically (e.g., scanned items, stopped threats, removed threats, etc.). Detected threats 510 may present graphically and textually types of threats detected such as malware, viruses, phishing scams, etc. An attack origins map 508 may display a geographical map of where the detected threats originate from. A top targeted users list 514 may display a list of users who receive the most targeted threats. The displayed information may include the ability to drill down. For example, by selecting one of the users in the top targeted users list 514, an administrator may be able to see details of threats received by that user, documents or communications affected by the threats, and even follow a chain of events, that is, see other users who may be affected by the selected user through exchanged communication, shared documents, etc. The dashboard 502 may also display suggestions 512 providing policy or remediation action proposals in light of the threat analysis, and an investigations section 516 that may allow the administrator to perform searches on people, communications, documents, and other threat related topics. Audit trails may also be accessed through the investigations section 516.
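As a non-limiting sketch, the top targeted users list and the drill-down that follows a chain of events may be modeled as a frequency count and a bounded breadth-first traversal over a graph of exchanges between users. The event fields, the adjacency mapping, and the depth limit are hypothetical assumptions made for illustration.

```python
from collections import Counter, deque
from typing import Dict, List


def top_targeted(threat_events: List[dict], n: int = 3) -> List[str]:
    """Users who received the most threats, most targeted first."""
    counts = Counter(event["recipient"] for event in threat_events)
    return [user for user, _ in counts.most_common(n)]


def chain_of_events(
    start: str, exchanges: Dict[str, List[str]], max_depth: int = 2
) -> Dict[str, int]:
    """Breadth-first walk from a selected user over exchanged
    communications or shared documents; returns each potentially
    affected user with the number of hops from the selected user."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        user = queue.popleft()
        if depth[user] == max_depth:
            continue  # do not expand beyond the requested depth
        for other in exchanges.get(user, ()):
            if other not in depth:
                depth[other] = depth[user] + 1
                queue.append(other)
    del depth[start]  # the selected user is not part of the result
    return depth
```

Selecting a user in the list would correspond to calling chain_of_events with that user as the starting point, with max_depth controlling how many levels deep the drill-down goes.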

The underlying data for some of the visualizations and other information displayed on the dashboard may be received from a data insights platform through a series of queries to the platform, which may collect signals from a number of resources within the tenant's hosted environment. The collected signals may be correlated at one or more levels based on their content and context. The data insights platform may focus and/or filter the queries on a portion of the collected and correlated signals based on a context of the query in relation to the collected and correlated signals. The data insights platform may then reply to the query with a comprehensive analysis report.

The dashboards 402 and 502 are not limited to the above described components and features. Various graphical, textual, coloring, shading, and visual effect schemes may be employed to provide a dashboard based on data from a data insights platform for a security and compliance environment.

The examples provided in FIGS. 1A through 5 are illustrated with specific systems, services, applications, modules, and displays. Embodiments are not limited to environments according to these examples. A data insights platform for a security and compliance environment may be implemented in environments employing fewer or additional systems, services, applications, modules, and displays. Furthermore, the example systems, services, applications, modules, and notifications shown in FIGS. 1A through 5 may be implemented in a similar manner with other user interface or action flow sequences using the principles described herein.

FIG. 6 is a networked environment, where a system according to embodiments may be implemented. A data insights platform as described herein may be employed in conjunction with hosted applications and services (for example, the client application 106 associated with the hosted service 114, or the protection service 126) that may be implemented via software executed over one or more servers 606 or individual server 608, as illustrated in diagram 600. A hosted service or application may communicate with client applications on individual computing devices such as a handheld computer 601, a desktop computer 602, a laptop computer 603, a smart phone 604, and a tablet computer (or slate) 605 (‘client devices’) through network(s) 610 and control a user interface, such as a dashboard, presented to users.

Client devices 601-605 are used to access the functionality provided by the hosted service or client application. One or more of the servers 606 or server 608 may be used to provide a variety of services as discussed above. Relevant data may be stored in one or more data stores (e.g. data store 614), which may be managed by any one of the servers 606 or by database server 612.

Network(s) 610 may comprise any topology of servers, clients, Internet service providers, and communication media. A system according to embodiments may have a static or dynamic topology. Network(s) 610 may include a secure network such as an enterprise network, an unsecure network such as a wireless open network, or the Internet. Network(s) 610 may also coordinate communication over other networks such as PSTN or cellular networks. Network(s) 610 provide communication between the nodes described herein. By way of example, and not limitation, network(s) 610 may include wireless media such as acoustic, RF, infrared and other wireless media.

Many other configurations of computing devices, applications, engines, data sources, and data distribution systems may be employed to provide a data insights platform for a security and compliance environment. Furthermore, the networked environments discussed in FIG. 6 are for illustration purposes only. Embodiments are not limited to the example applications, engines, or processes.

FIG. 7 is a block diagram of an example computing device, which may be used to provide a data insights platform for a security and compliance environment.

For example, computing device 700 may be used as a server, desktop computer, portable computer, smart phone, special purpose computer, or similar device. In an example basic configuration 702, the computing device 700 may include one or more processors 704 and a system memory 706. A memory bus 708 may be used for communicating between the processor 704 and the system memory 706. The basic configuration 702 is illustrated in FIG. 7 by those components within the inner dashed line.

Depending on the desired configuration, the processor 704 may be of any type, including but not limited to a microprocessor (μP), a microcontroller (μC), a digital signal processor (DSP), or any combination thereof. The processor 704 may include one or more levels of caching, such as a level cache memory 712, one or more processor cores 714, and registers 716. The example processor cores 714 may (each) include an arithmetic logic unit (ALU), a floating point unit (FPU), a digital signal processing core (DSP Core), or any combination thereof. An example memory controller 718 may also be used with the processor 704, or in some implementations the memory controller 718 may be an internal part of the processor 704.

Depending on the desired configuration, the system memory 706 may be of any type including but not limited to volatile memory (such as RAM), non-volatile memory (such as ROM, flash memory, etc.) or any combination thereof. The system memory 706 may include an operating system 720, a protection application or service 722, and program data 724. The protection application or service 722 may include a data insights platform 726, which may be an integrated module of the protection application or service 722. The data insights platform 726 may be configured to collect a plurality of signals from a plurality of resources within a tenant's hosted environment, where the collected plurality of signals are correlated at one or more levels based on their content and context. The data insights platform 726 may receive a query associated with the collected plurality of signals and focus/filter the query on a portion of the collected and correlated signals based on a context of the query in relation to the collected and correlated signals. The data insights platform 726 may then reply to the query with a comprehensive analysis report. The program data 724 may include, among other data, insights data 728 as described herein.

The computing device 700 may have additional features or functionality, and additional interfaces to facilitate communications between the basic configuration 702 and any desired devices and interfaces. For example, a bus/interface controller 730 may be used to facilitate communications between the basic configuration 702 and one or more data storage devices 732 via a storage interface bus 734. The data storage devices 732 may be one or more removable storage devices 736, one or more non-removable storage devices 738, or a combination thereof. Examples of the removable storage and the non-removable storage devices include magnetic disk devices such as flexible disk drives and hard-disk drives (HDDs), optical disk drives such as compact disk (CD) drives or digital versatile disk (DVD) drives, solid state drives (SSD), and tape drives to name a few. Example computer storage media may include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules, or other data.

The system memory 706, the removable storage devices 736 and the non-removable storage devices 738 are examples of computer storage media. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVDs), solid state drives, or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which may be used to store the desired information and which may be accessed by the computing device 700. Any such computer storage media may be part of the computing device 700.

The computing device 700 may also include an interface bus 740 for facilitating communication from various interface devices (for example, one or more output devices 742, one or more peripheral interfaces 744, and one or more communication devices 746) to the basic configuration 702 via the bus/interface controller 730. Some of the example output devices 742 include a graphics processing unit 748 and an audio processing unit 750, which may be configured to communicate to various external devices such as a display or speakers via one or more A/V ports 752. One or more example peripheral interfaces 744 may include a serial interface controller 754 or a parallel interface controller 756, which may be configured to communicate with external devices such as input devices (for example, keyboard, mouse, pen, voice input device, touch input device, etc.) or other peripheral devices (for example, printer, scanner, etc.) via one or more I/O ports 758. An example communication device 746 includes a network controller 760, which may be arranged to facilitate communications with one or more other computing devices 762 over a network communication link via one or more communication ports 764. The one or more other computing devices 762 may include servers, computing devices, and comparable devices.

The network communication link may be one example of a communication media. Communication media may typically be embodied by computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism, and may include any information delivery media. A “modulated data signal” may be a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media may include wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, radio frequency (RF), microwave, infrared (IR) and other wireless media. The term computer readable media as used herein may include both storage media and communication media.

The computing device 700 may be implemented as a part of a specialized server, mainframe, or similar computer that includes any of the above functions. The computing device 700 may also be implemented as a personal computer including both laptop computer and non-laptop computer configurations.

Example embodiments may also include methods to provide a data insights platform for a security and compliance environment. These methods can be implemented in any number of ways, including the structures described herein. One such way may be by machine operations of devices of the type described in the present disclosure. Another optional way may be for one or more of the individual operations of the methods to be performed in conjunction with one or more human operators performing some of the operations while other operations may be performed by machines. These human operators need not be collocated with each other, but each can be with a machine that performs a portion of the program. In other embodiments, the human interaction can be automated such as by pre-selected criteria that may be machine automated.

FIG. 8 illustrates a logic flow diagram of a method to provide a data insights platform for a security and compliance environment. Process 800 may be implemented on a computing device, server, or other system. An example server may comprise a communication interface to facilitate communication between one or more client devices and the server. The example server may also comprise a memory to store instructions, and one or more processors coupled to the memory. The processors, in conjunction with the instructions stored on the memory, may be configured to provide a data insights platform for a security and compliance environment.

Process 800 begins with operation 810, where a plurality of signals such as documents, communications, metadata, and activities may be collected from a plurality of resources within a tenant's hosted environment. The collected signals may be correlated at one or more levels based on their content and context (e.g., documents based on metadata or activities associated with them).

At operation 820, a query associated with the collected plurality of signals may be received from a component or client of a security and compliance service such as a data explorer module or a threat intelligence module. The query may be focused on a portion of the collected and correlated signals or filtered based on a context of the query in relation to the collected and correlated signals at operation 830. At operation 840, the query may be replied to by the data insights platform with a comprehensive analysis report based on the focused/filtered execution of the query on the contextual portion of the signals.
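Operations 810 through 840 can be pictured with a minimal sketch. The sketch is illustrative only and is not the disclosed implementation: the Signal fields, the dictionary-based query format, the rule of correlating signals by shared resource, and all function names are assumptions introduced here for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Signal:
    """One collected signal; every field name here is an illustrative assumption."""
    kind: str       # e.g. "document", "activity", "metadata"
    resource: str   # hypothetical identifier of the item the signal relates to
    payload: dict = field(default_factory=dict)


def collect_and_correlate(signals):
    """Operation 810 (sketch): correlate signals that refer to the same resource."""
    correlated = {}
    for s in signals:
        correlated.setdefault(s.resource, []).append(s)
    return correlated


def focus_query(correlated, query):
    """Operation 830 (sketch): keep only resources whose signals match the query context."""
    wanted_kinds = set(query.get("kinds", []))
    return {
        resource: group
        for resource, group in correlated.items()
        if any(s.kind in wanted_kinds for s in group)
    }


def reply(focused, query):
    """Operation 840 (sketch): summarize the focused signals as a simple report."""
    return {
        "requester": query["requester"],
        "resources": sorted(focused),
        "signal_count": sum(len(group) for group in focused.values()),
    }


signals = [
    Signal("document", "doc-1", {"label": "confidential"}),
    Signal("activity", "doc-1", {"action": "download"}),
    Signal("document", "doc-2", {"label": "public"}),
]
correlated = collect_and_correlate(signals)

# Operation 820 (sketch): a query arrives from a hypothetical threat intelligence module.
query = {"requester": "threat_intelligence", "kinds": ["activity"]}
report = reply(focus_query(correlated, query), query)
```

In this sketch the query only asks about activities, yet the report covers every signal correlated with the matching resource, which is one simple way a narrow query may be expanded into a broader analysis.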

The operations included in process 800 are for illustration purposes. A data insights platform for a security and compliance environment may be implemented by similar processes with fewer or additional steps, as well as in different order of operations using the principles described herein. The operations described herein may be executed by one or more processors operated on one or more computing devices, one or more processor cores, specialized processing devices, and/or general purpose processors, among other examples.

According to examples, a means for providing a data insights platform for a security and compliance environment is described. The means may include a means for collecting a plurality of signals from a plurality of resources within a tenant's hosted environment, where the collected plurality of signals are correlated at one or more levels based on their content and context; a means for receiving a query associated with the collected plurality of signals; a means for focusing and filtering the query on a portion of the collected plurality of signals based on a context of the query in relation to the collected plurality of signals; and a means for replying to the query with a comprehensive analysis report based on the focused and filtered execution of the query on the portion of the collected plurality of signals.

According to some examples, a method to provide a data insights platform for a security and compliance environment is described. The method may include collecting a plurality of signals from a plurality of resources within a tenant's hosted environment, where the collected plurality of signals are correlated at one or more levels based on their content and context; receiving a query associated with the collected plurality of signals; focusing and filtering the query on a portion of the collected plurality of signals based on a context of the query in relation to the collected plurality of signals; and replying to the query with a comprehensive analysis report based on the focused and filtered execution of the query on the portion of the collected plurality of signals.

According to other examples, the method may also include aggregating the plurality of signals in real time. The method may further include receiving the query from and replying to one or more of a data explorer module configured to identify and categorize the aggregated plurality of signals and a threat intelligence module configured to manage threats to the tenant's hosted environment. The method may also include providing one or more of raw signals, filtered signals at one or more correlation levels, and signals generated during the aggregation of the collected plurality of signals to one or more of the data explorer module and the threat intelligence module. The method may yet include detecting a pattern associated with and a usage of the collected plurality of signals; deriving an insight based on the pattern; and presenting the derived insight.

According to further examples, the method may also include receiving pre-correlated signals from a graph-based data correlation service. The collected plurality of signals may include one or more of documents, non-document content, communications, metadata, activities, organizational relationships, and configurations. The method may further include determining which type of collected signals to execute a received query on based on a request for the query. The method may also include determining how to augment the query based on a context of the request and/or aggregating query results based on a type of a requesting module of a security and compliance service. The type of the requesting module may include one of a data classification module, a threat management module, a policy management module, and an alert management module.
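As a hedged illustration of selecting signal types from a request and aggregating results by the type of the requesting module, consider the following sketch. The mapping from module types to signal types, the request dictionary layout, and the choice of aggregation keys are assumptions made for the example; the disclosure names the module types but does not specify these details.

```python
from collections import Counter

# Hypothetical mapping from requesting module type to the signal types it is
# assumed to need; the module type names follow the text, the mapping does not.
SIGNAL_TYPES_BY_MODULE = {
    "data_classification": {"documents", "metadata"},
    "threat_management": {"communications", "activities"},
    "policy_management": {"configurations", "metadata"},
    "alert_management": {"activities"},
}


def select_signal_types(request):
    """Decide which types of collected signals a query should run on."""
    explicit = request.get("signal_types")
    if explicit:
        # The request itself names the signal types, so honor them.
        return set(explicit)
    # Otherwise fall back to the assumed default for the requesting module.
    return SIGNAL_TYPES_BY_MODULE.get(request["module_type"], set())


def aggregate_results(results, module_type):
    """Aggregate query results by a key chosen from the requesting module type."""
    # Assumption: threat and alert modules care about who acted, others about labels.
    if module_type in ("threat_management", "alert_management"):
        key = "user"
    else:
        key = "label"
    return Counter(r.get(key, "unknown") for r in results)


results = [
    {"user": "alice", "label": "confidential"},
    {"user": "alice", "label": "public"},
    {"user": "bob", "label": "confidential"},
]
by_user = aggregate_results(results, "threat_management")
by_label = aggregate_results(results, "data_classification")
```

The same result set yields a per-user view for a threat management module and a per-label view for a data classification module, which is the kind of requester-dependent aggregation the paragraph above describes.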

According to other examples, a server configured to provide a data insights platform for a security and compliance environment is described. The server may include a communication interface configured to facilitate communication between another server hosting a security and compliance service, one or more client devices, and the server; a memory configured to store instructions; and one or more processors coupled to the communication interface and the memory and configured to execute the data insights platform. The data insights platform may be configured to collect a plurality of signals from a plurality of resources within a tenant's hosted environment, where the collected plurality of signals are correlated at one or more levels based on their content and context; receive a query associated with the collected plurality of signals; focus and filter the query on a portion of the collected plurality of signals based on a context of the query in relation to the collected plurality of signals; reply to the query with a comprehensive analysis report based on the focused and filtered execution of the query on the portion of the collected plurality of signals; and provide one or more of raw signals, filtered signals at one or more correlation levels, and signals generated during an aggregation of the collected plurality of signals to one or more of a data explorer module, an alert management module, and a threat intelligence module within the security and compliance service.

According to some examples, the data insights platform may include a reporting framework to manage replies to queries, an aggregation store to store aggregated signals, and a data insights application programming interface (API) to communicate with the data explorer module, the alert management module, and the threat intelligence module. The data insights platform may further include a background job framework configured to manage aggregation tasks associated with the data insights platform. The aggregation tasks may include one or more of alert aggregation, policy aggregation, threat intelligence aggregation, default aggregation, system policy aggregation, reporting aggregation, and user experience data aggregation. The user experience data aggregation may include customization insights and tenant usage insights. The plurality of signals may include documents, non-document content, communications, and activities and metadata associated with the documents, the non-document content, and the communications. The data insights platform may be configured to correlate and evaluate the documents, the non-document content, and the communications in context of corresponding activities and metadata associated with the documents, the non-document content, and the communications.
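One way to picture a background job framework that manages aggregation tasks and writes to an aggregation store is the sketch below. It is an assumption-laden illustration, not the disclosed design: the class name, the register and run_all methods, the in-memory dictionary used as the store, and the two sample tasks are all hypothetical.

```python
from typing import Callable, Dict, List

# An aggregation task takes the collected signals and returns an aggregate.
AggregationTask = Callable[[List[dict]], dict]


class BackgroundJobFramework:
    """Hypothetical registry that runs named aggregation tasks."""

    def __init__(self) -> None:
        self._tasks: Dict[str, AggregationTask] = {}
        # Stand-in for the aggregation store: task name -> latest aggregate.
        self.aggregation_store: Dict[str, dict] = {}

    def register(self, name: str, task: AggregationTask) -> None:
        self._tasks[name] = task

    def run_all(self, signals: List[dict]) -> Dict[str, dict]:
        """Run every registered task and record its result in the store."""
        for name, task in self._tasks.items():
            self.aggregation_store[name] = task(signals)
        return self.aggregation_store


def alert_aggregation(signals: List[dict]) -> dict:
    """Sample alert aggregation: count signals flagged as alerts."""
    return {"alert_count": sum(1 for s in signals if s.get("kind") == "alert")}


def default_aggregation(signals: List[dict]) -> dict:
    """Sample default aggregation: count all signals."""
    return {"signal_count": len(signals)}


jobs = BackgroundJobFramework()
jobs.register("alert", alert_aggregation)
jobs.register("default", default_aggregation)
store = jobs.run_all([{"kind": "alert"}, {"kind": "document"}, {"kind": "alert"}])
```

Other aggregation tasks named in the text, such as policy aggregation or reporting aggregation, could be registered in the same way under this assumed interface.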

According to further examples, a computer-readable memory device with instructions stored thereon to provide a data insights platform for a security and compliance environment is described. The instructions, when executed, may be configured to cause one or more computing devices to perform actions that include collect a plurality of signals comprising documents, non-document content, communications, and activities and metadata associated with the documents, the non-document content, and the communications from a plurality of resources within a tenant's hosted environment, where the collected plurality of signals are correlated at one or more levels based on their content and a context of corresponding activities and metadata associated with the documents, the non-document content, and the communications; receive a query associated with the collected plurality of signals; focus and filter the query on a portion of the collected plurality of signals based on a context of the query in relation to the collected plurality of signals; and reply to the query with a comprehensive analysis report based on the focused and filtered execution of the query on the portion of the collected plurality of signals.

According to yet other examples, the correlation may be based on one or more of a label of, a sensitive content within, a type of, an age of, a storage location of, a location of a user accessing, and an identity of a user or an entity accessing the documents, the non-document content, and the communications.
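Correlation on attributes such as label, type, and age can be sketched as grouping items by a composite key. The item dictionary layout, the one-year age bucket, and the function names below are assumptions chosen for illustration; the disclosure lists the attributes but does not prescribe how they are combined.

```python
from collections import defaultdict
from datetime import date


def correlation_key(item, today, attributes):
    """Build a composite key from the requested attributes of one item."""
    key = []
    for attr in attributes:
        if attr == "age":
            # Assumption: age is bucketed at one year rather than used raw.
            days = (today - item["created"]).days
            key.append("older_than_year" if days > 365 else "within_year")
        else:
            key.append(item.get(attr))
    return tuple(key)


def correlate(items, today, attributes):
    """Group item identifiers that share the same composite key."""
    groups = defaultdict(list)
    for item in items:
        groups[correlation_key(item, today, attributes)].append(item["id"])
    return dict(groups)


items = [
    {"id": "doc-1", "label": "confidential", "type": "docx", "created": date(2015, 3, 1)},
    {"id": "doc-2", "label": "confidential", "type": "docx", "created": date(2016, 11, 20)},
    {"id": "mail-1", "label": "confidential", "type": "email", "created": date(2016, 12, 1)},
]
groups = correlate(items, date(2016, 12, 30), ["label", "age"])
```

Here a document and an email end up in the same group because they share a label and an age bucket, while an older document with the same label is grouped separately.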

The above specification, examples and data provide a complete description of the manufacture and use of the composition of the embodiments. Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims and embodiments. 

What is claimed is:
 1. A method to provide a data insights platform for a security and compliance environment, the method comprising: collecting a plurality of signals from a plurality of resources within a tenant's hosted environment, wherein the collected plurality of signals are correlated at one or more levels based on their content and context; receiving a query associated with the collected plurality of signals; focusing and filtering the query on a portion of the collected plurality of signals based on a context of the query in relation to the collected plurality of signals; and replying to the query with a comprehensive analysis report based on the focused and filtered execution of the query on the portion of the collected plurality of signals.
 2. The method of claim 1, further comprising: aggregating the plurality of signals in real time.
 3. The method of claim 2, further comprising: receiving the query from and replying to one or more of a data explorer module configured to identify and categorize the aggregated plurality of signals and a threat intelligence module configured to manage threats to the tenant's hosted environment.
 4. The method of claim 3, further comprising: providing one or more of raw signals, filtered signals at one or more correlation levels, and signals generated during the aggregation of the collected plurality of signals to one or more of the data explorer module and the threat intelligence module.
 5. The method of claim 1, further comprising: detecting a pattern associated with and a usage of the collected plurality of signals; deriving an insight based on the pattern; and presenting the derived insight.
 6. The method of claim 1, further comprising: receiving pre-correlated signals from a graph-based data correlation service.
 7. The method of claim 1, wherein the collected plurality of signals comprise one or more of documents, non-document content, communications, metadata, activities, organizational relationships, and configurations.
 8. The method of claim 1, further comprising: determining which type of collected signals to execute a received query on based on a request for the query.
 9. The method of claim 8, further comprising: determining how to augment the query based on a context of the request.
 10. The method of claim 1, further comprising: aggregating query results based on a type of a requesting module of a security and compliance service.
 11. The method of claim 10, wherein the type of the requesting module includes one of a data classification module, a threat management module, a policy management module, and an alert management module.
 12. A server configured to provide a data insights platform for a security and compliance environment, the server comprising: a communication interface configured to facilitate communication between another server hosting a security and compliance service, one or more client devices, and the server; a memory configured to store instructions; and one or more processors coupled to the communication interface and the memory and configured to execute the data insights platform, wherein the data insights platform is configured to: collect a plurality of signals from a plurality of resources within a tenant's hosted environment, wherein the collected plurality of signals are correlated at one or more levels based on their content and context; receive a query associated with the collected plurality of signals; focus and filter the query on a portion of the collected plurality of signals based on a context of the query in relation to the collected plurality of signals; reply to the query with a comprehensive analysis report based on the focused and filtered execution of the query on the portion of the collected plurality of signals; and provide one or more of raw signals, filtered signals at one or more correlation levels, and signals generated during an aggregation of the collected plurality of signals to one or more of a data explorer module, an alert management module, and a threat intelligence module within the security and compliance service.
 13. The server of claim 12, wherein the data insights platform includes a reporting framework to manage replies to queries, an aggregation store to store aggregated signals, and a data insights application programming interface (API) to communicate with the data explorer module, the alert management module, and the threat intelligence module.
 14. The server of claim 12, further comprising a background job framework configured to manage aggregation tasks associated with the data insights platform.
 15. The server of claim 14, wherein the aggregation tasks include one or more of alert aggregation, policy aggregation, threat intelligence aggregation, default aggregation, system policy aggregation, reporting aggregation, and user experience data aggregation.
 16. The server of claim 15, wherein the user experience data aggregation includes customization insights and tenant usage insights.
 17. The server of claim 12, wherein the plurality of signals include documents, non-document content, communications, and activities and metadata associated with the documents, the non-document content, and the communications.
 18. The server of claim 17, wherein the data insights platform is configured to correlate and evaluate the documents, the non-document content, and the communications in context of corresponding activities and metadata associated with the documents, the non-document content, and the communications.
 19. A computer-readable memory device with instructions stored thereon to provide a data insights platform for a security and compliance environment, the instructions, when executed, configured to cause one or more computing devices to perform actions comprising: collect a plurality of signals comprising documents, non-document content, communications, and activities and metadata associated with the documents, the non-document content, and the communications from a plurality of resources within a tenant's hosted environment, wherein the collected plurality of signals are correlated at one or more levels based on their content and a context of corresponding activities and metadata associated with the documents, the non-document content, and the communications; receive a query associated with the collected plurality of signals; focus and filter the query on a portion of the collected plurality of signals based on a context of the query in relation to the collected plurality of signals; and reply to the query with a comprehensive analysis report based on the focused and filtered execution of the query on the portion of the collected plurality of signals.
 20. The computer-readable memory device of claim 19, wherein the correlation is based on one or more of a label of, a sensitive content within, a type of, an age of, a storage location of, a location of a user accessing, and an identity of a user or an entity accessing the documents, the non-document content, and the communications.